Abstract: This paper considers the queueing performance of 
a system that transmits coded data over a time-varying erasure 
channel. In our model, the queue length and channel state 
together form a Markov chain that depends on the system 
parameters. This gives a framework that allows a rigorous 
analysis of the queue as a function of the code rate. Most prior 
work in this area either ignores block-length (e.g., fluid models) or 
assumes error-free communication using finite codes. This work 
enables one to determine when such assumptions provide good, 
or bad, approximations of true behavior. Moreover, it offers a 
new approach to optimize parameters and evaluate performance. 
This can be valuable for delay-sensitive systems that employ short 
block lengths. 

I. Introduction 

Forward error-correcting codes have played an instrumental 
role in the many successes of digital communications over 
the past decades [1]. The fact that it is possible to transmit 
digital information reliably at a positive rate over an unknown 
noisy channel is now universally acknowledged [2]. The main 
cost of improving reliability is the use of increasingly long 
codewords [3]. One situation where the valuable lessons of 
classical coding theory may not apply directly is the general 
area of delay-constrained communications. If system speci- 
fications dictate that almost all information bits should be 
made available at the destination shortly after they arrived 
at the transmitter, it may not be possible to aggregate a 
large number of them before encoding and transmission. In 
some cases, stringent delay requirements will force a system 
designer to resort to short block codes or short constraint- 
length convolutional codes. 

From a coding perspective, using short codewords on chan- 
nels with memory creates two impediments. First, decoders 
are designed to correct the most-likely error patterns and 
the probability of seeing atypical error patterns cannot be 
neglected for short block lengths. Second, if the coherence 
time of the channel is longer than a codeword transmission 
interval, then the optimal code rate may depend heavily on the 
channel state, which is unknown to the transmitter. Together, 
these factors impair the rapid transmission of information. 

Coding performance as a function of block-length and code- 
rate has been assessed in the information theory literature 
using the reliability function [3]. This criterion focuses on the 
exponential rate at which the error probability decays with 
block length, known as the error exponent, as a function of 
information rate. The concept of a reliability function can 
also be extended to variable-length codes in the presence of 
feedback [4]. More recently, consideration has been given to 
the reliability function for bits with fixed delay, as opposed to 
coded blocks, in the presence of feedback [5]. 

While remarkable, these results remain asymptotic in nature 
and do not necessarily capture overall system behavior ade- 
quately. For delay-sensitive applications and short codewords, 
three interrelated effects come into play. The probability of 
decoding failure for every codeword is not negligible. Packet 
retransmissions lead to queue buildups at the source and, 
thereby, induce longer latencies. Channel correlation over time 
introduces dependencies among successive decoding attempts, 
which further perturb queueing behavior and end-to-end delay. 
This is especially true when decoding failures are likely to 
occur in sequence [6]. Thus, a queueing analysis is necessary 
when considering the behavior of communication systems 
subject to very stringent delay requirements. 

For delay-sensitive systems with short codewords, the nat- 
ural tradeoff between code-rate and probability of decoding 
failure is hard to characterize [7]. In a non-asymptotic regime 
where information is queued at the source, transmitting data 
at a rate slightly below Shannon capacity may lead to poor 
performance. Recent results in the literature hint at the fact 
that, for delay-constrained communication, optimal code-rate 
selection depends heavily on block-length and channel correlation 
[8], [9]. These findings are especially important for real-time 
traffic and live interactive sessions, as these applications 
are sensitive to latency and require the use of short codewords. 

Guidelines for code-rate selection in the context of delay-sensitive 
traffic were previously obtained for an erasure channel 
with memory [10]. The approach favored therein, which 
permits a complete characterization of queueing behavior, 
consists in building a Markov model for the evolution of 
the system. Crucial assumptions that facilitate analysis can 
be summarized as follows: the packet arrival process at the 
source is Bernoulli, the packet lengths are i.i.d. geometric, the 
error protection uses random codes, and the channel evolution 
is governed by a Markov chain. 

In this article, we adopt a similar formulation and extend 
results that were obtained for the correlated erasure case to a 
more encompassing Gilbert-Elliot framework. This latter class 
of erasure channels is common to the literature on channels 
with memory, and subsumes earlier work based on similar 
concepts. We also present an in-depth analysis of system 
performance using different criteria that reflect the needs of 
various contemporary applications. This research is significant 
because it offers a new perspective on the selection of code-rate 
and block-length for delay-sensitive systems and provides 
a rigorous investigation into the effects of time-correlation on 
the queued performance of real-time wireless connections. 

Fig. 1. A Gilbert-Elliot bit erasure channel is employed to model the 
operation of a communication link with memory. This model captures both 
the uncertainty associated with transmitting bits over a noisy channel and 
correlation over time typical of several communication channels. 

II. Channel Abstraction and Coding 

Throughout, we assume that coded bits are sent from the 
transmitter to the destination over a Gilbert-Elliot erasure 
channel. This channel can be in one of two states: a good 
state $g$ in which every bit is erased with probability $\varepsilon_g$ and 
a bad state $b$ in which every bit is erased with probability 
$\varepsilon_b$, independently of other bits. Our naming scheme implies 
$\varepsilon_b > \varepsilon_g$. Transitions between channel states occur according 
to a Markov process. The probability of transitioning to state $g$ 
given that the Markov chain is currently in state $b$ is denoted 
by $\alpha$. The likelihood of the reverse transition from $g$ to $b$ 
is symbolized by $\beta$. Under alphabetical state ordering, the 
parameters of this Markov chain can be expressed in the form 
of a probability transition matrix, 

$$P = \begin{bmatrix} 1 - \alpha & \alpha \\ \beta & 1 - \beta \end{bmatrix}. \qquad (1)$$

A graphical interpretation of the communication channel under 
consideration appears in Fig. 1. 

The state of the channel at time $n$ is a random variable, 
which we denote by $C_n$. Moreover, the succession of states 
over time, $\{C_n : n \in \mathbb{N}\}$, forms a Markov process. Finding 
the conditional probability $\Pr(C_{n+1} = d \mid C_n = c)$ amounts to 
selecting an entry in $P$. Likewise, $\Pr(C_{n+N} = d \mid C_n = c)$ can 
be obtained by locating the corresponding entry in $P^N$, the 
$N$th power of $P$. We note that this Markov chain converges to 
its stationary distribution at an exponential rate that depends 
on the second eigenvalue of $P$ (i.e., $1 - \alpha - \beta$). 
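
As a quick numerical illustration of these statements, the following sketch builds the two-state chain in plain Python and compares the $N$-step matrix with the closed form implied by the second eigenvalue. The parameter values are borrowed from the numerical example in Section V; the helper names `matmul` and `matpow` are ours and purely illustrative.

```python
# Illustrative values, borrowed from the numerical example in Section V.
alpha, beta = 0.02, 0.005   # Pr(b -> g) and Pr(g -> b)
N = 114                     # block length in channel uses

# Rows and columns are ordered (b, g), as in the transition matrix P.
P = [[1 - alpha, alpha],
     [beta, 1 - beta]]

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(A, n):
    """n-th power of a 2x2 matrix by repeated squaring."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    while n > 0:
        if n & 1:
            result = matmul(result, A)
        A = matmul(A, A)
        n >>= 1
    return result

# N-step transition probabilities Pr(C_{n+N} = d | C_n = c).
P_N = matpow(P, N)

# Stationary distribution of the two-state chain, ordered (b, g).
pi = [beta / (alpha + beta), alpha / (alpha + beta)]

# Second eigenvalue; the gap between P^N and its limit shrinks like lam**N.
lam = 1 - alpha - beta
closed_form = [[pi[j] + lam ** N * ((1.0 if i == j else 0.0) - pi[j])
                for j in range(2)] for i in range(2)]
```

With these values, `lam ** N` is still far from zero after one block, which is one way to see that the channel state at the start of a codeword remains informative about the state at its end.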

In our analysis, a packet of length L is sectioned into M 
data segments each containing K information bits. Packing 
loss is treated implicitly since the last data segment of each 
packet is zero padded to K bits. Every segment is encoded 
separately into a codeword of length $N$, which is subsequently 
stored in the queue for eventual transmission over the Gilbert- 
Elliot erasure channel. Decoding failures are handled through 
immediate retransmission of the missing data. 



A. Distribution of Erasures 

A quantity that is of fundamental importance in our analysis 
is the conditional probability of decoding failure at the desti- 
nation. An intermediary step in identifying this probability is 
to derive an expression for $E$, the number of erasures within a 
codeword of length $N$. This, in turn, depends on the number 
of visits to each state within $N$ consecutive realizations of the 
channel. More specifically, we are interested in conditional 
probabilities of the form 

$$\Pr(E = e, C_{N+1} = d \mid C_1 = c), \qquad (2)$$

where $e \in \mathbb{N}_0$ and $c, d \in \{b, g\}$. The generating function 
for these conditional probabilities is based on generalizing the 
entries of $P$ to the vector space of real polynomials in $x$ with 

$$P_x = \begin{bmatrix} (1 - \alpha)(1 - \varepsilon_b + \varepsilon_b x) & \alpha (1 - \varepsilon_b + \varepsilon_b x) \\ \beta (1 - \varepsilon_g + \varepsilon_g x) & (1 - \beta)(1 - \varepsilon_g + \varepsilon_g x) \end{bmatrix}.$$

Let $[x^j]$ be the operator which maps a polynomial in $x$ to 
the coefficient of $x^j$. Then, the conditional probability (2) is 
given, in terms of the $N$th power of $P_x$, by 

$$\Pr(E = e, C_{N+1} = d \mid C_1 = c) = [x^e] \left[ P_x^N \right]_{c,d}.$$

It is worth mentioning that one can employ this method or 
alternative combinatorial means to obtain closed-form expressions 
for the desired conditional probabilities [11], [10]. 
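
The coefficient extraction above can also be carried out numerically. The sketch below is one possible way to do so, not necessarily the procedure used by the authors: it multiplies by $P_x$ one channel use at a time while tracking polynomial coefficients as lists. The function name `erasure_joint` and the state indexing (0 for $b$, 1 for $g$) are assumptions made for illustration, and the parameter values come from Section V.

```python
def erasure_joint(N, alpha, beta, eps_b, eps_g):
    """Return D with D[c][d][e] = Pr(E = e, C_{N+1} = d | C_1 = c).

    States are indexed 0 = b and 1 = g. Each entry of P_x is a degree-1
    polynomial in x, so the N-th power is built by propagating the
    coefficient lists one channel use at a time.
    """
    P = [[1 - alpha, alpha], [beta, 1 - beta]]
    eps = [eps_b, eps_g]
    # Before any channel use: identity matrix, zero erasures.
    D = [[[1.0 if c == d else 0.0] + [0.0] * N for d in range(2)]
         for c in range(2)]
    for _ in range(N):
        new = [[[0.0] * (N + 1) for _ in range(2)] for _ in range(2)]
        for c in range(2):
            for k in range(2):          # state during the current channel use
                for d in range(2):      # state after the transition
                    step = P[k][d]
                    for e in range(N + 1):
                        mass = D[c][k][e]
                        if mass == 0.0:
                            continue
                        # Bit received: erasure count unchanged.
                        new[c][d][e] += mass * step * (1 - eps[k])
                        # Bit erased: erasure count grows by one.
                        if e < N:
                            new[c][d][e + 1] += mass * step * eps[k]
        D = new
    return D

# Joint distribution for the Section V parameters.
D = erasure_joint(114, 0.02, 0.005, 0.49, 0.0025)

# Marginal distribution of E when the block starts in the bad state.
pmf_given_b = [D[0][0][e] + D[0][1][e] for e in range(115)]
```

Summing `D[c][d][e]` over `e` recovers the corresponding entry of $P^N$, which offers a simple consistency check on the computation.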

B. Probability of Decoding Failure 

During every transmission, a segment of $K$ information bits 
is encoded using a code defined by a random parity-check 
matrix $H$ of size $(N - K) \times N$, where each matrix entry is 
selected independently and uniformly from $\{0, 1\}$. Maximum 
likelihood decoding is used at the destination. 

Random coding has the benefit that the probability of 
decoding failure depends only on the number of erasures 
and not on the locations of the erasures. Consequently, the 
decoding failure probability is a function of the number of 
erasures E in the block. Once the value of E is known, we 
can derive the desired probability as follows. Conditioned on 
$E = e$, decoding at the destination will succeed if and only if 
the submatrix of $H$ formed by choosing the $e$ erased columns 
has rank $e$ [12]. Furthermore, the probability that a random 
$e \times p$ matrix over $\mathbb{F}_2$, where $p = N - K$ stands for the 
number of parity bits, has rank $e$ is equal to $\prod_{j=0}^{e-1} (1 - 2^{j-p})$. 
Thus, given $e$ erasures within a codeword of length $N$, the 
probability of decoding failure can be written as 

$$P_f(N-K, e) = 1 - \prod_{j=0}^{e-1} \left( 1 - 2^{j-(N-K)} \right).$$

The average probability of decoding failure at the destination 
is therefore equal to $P_f(N-K) = \mathrm{E}[P_f(N-K, E)]$, 
where the expectation over $E$ depends implicitly on all possible 
channel realizations within a block. While the average 
probability of decoding failure offers a good measure of per- 
formance, it alone does not capture the queueing behavior of 
the system. Indeed, correlation among decoding-failure events 
may also alter the behavior of the queue at the transmitter. 
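
The failure probability is straightforward to evaluate. The following minimal sketch implements the product formula for $P_f(N-K, e)$ and its average over a given erasure distribution; the function names `p_fail` and `average_p_fail` are hypothetical, and the erasure distribution is assumed to be supplied as a list indexed by $e$.

```python
def p_fail(parity, e):
    """P_f(N - K, e): failure probability with e erasures and N - K parity bits."""
    if e > parity:
        # More erased positions than parity equations; the product
        # contains a zero factor, so decoding fails with probability one.
        return 1.0
    success = 1.0
    for j in range(e):
        success *= 1.0 - 2.0 ** (j - parity)
    return 1.0 - success

def average_p_fail(parity, erasure_pmf):
    """E[P_f(N - K, E)] for a distribution of E given as a list indexed by e."""
    return sum(prob * p_fail(parity, e) for e, prob in enumerate(erasure_pmf))
```

For instance, `p_fail(2, 2)` evaluates $1 - (1 - 2^{-2})(1 - 2^{-1}) = 0.625$, while any `e` larger than `parity` yields a failure probability of one.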



III. Arrival and Departure Processes 

Having introduced a precise model for the physical layer, we 
turn to the description of the arrival and departure processes 
at the queue. In our framework, the block-length, which we 
denote by N, remains fixed throughout and every codeword 
transmission requires N consecutive uses of the channel. 
Each data packet is broken into length-$K$ data segments 
that are separately encoded into blocks. In terms of system 
characterization, $N$ is fundamental in that it determines the 
sampling period of our Markov chain. 

We assume that the packet arrival process is i.i.d. Bernoulli 
with parameter $\gamma$. This implies that, during each codeword 
transmission interval, a new packet arrives at the source 
with probability $\gamma$. The number of bits in each data packet 
is assumed to be an i.i.d. random process whose marginal 
distribution is geometric with parameter $\rho$. Therefore, the 
probability that a packet contains exactly $\ell$ bits becomes 

$$\Pr(L = \ell) = (1 - \rho)^{\ell - 1} \rho, \qquad \ell = 1, 2, \ldots$$

where $\rho \in (0, 1)$. These assumptions on the structure of the 
arrival process and the packet-length distribution are crucial 
for the construction of a tractable Markov model for our 
communication system. They enable a rigorous analysis of the 
queue and lead to meaningful guidelines for system design and 
optimization. 

Departures from the queue are governed by the underlying 
Gilbert-Elliot channel and the design-rate r = K/N of our 
random linear code. The number of information bits contained 
in every codeword is therefore K = rN. A low-rate code will, 
in general, have a smaller probability of decoding failure than 
the same system with a higher rate code. Still, the successful 
decoding of a codeword associated with a high-rate code 
leads to the transmission of a larger amount of data bits. 
These competing considerations create a natural tradeoff be- 
tween information content and probability of decoding failure. 
Accordingly, the code-rate r, or equivalently the number of 
information bits K, is a parameter that should be optimized. 

Once a code rate is selected, the number of successfully 
decoded codewords needed to complete the transmission of a 
given packet is $M = \lceil L / rN \rceil$. Since $L$ is geometric, we find 
that $M$ also has a geometric distribution, albeit with parameter 

$$\rho_r = \sum_{\ell=1}^{rN} (1 - \rho)^{\ell - 1} \rho = 1 - (1 - \rho)^{rN}.$$

The probability that a data packet requires the successful 
transmission of $m$ data segments of size $rN$ is equal to 

$$\Pr(M = m) = (1 - \rho_r)^{m-1} \rho_r, \qquad m = 1, 2, \ldots$$
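
The geometric structure of $M$ can be checked by brute force. The sketch below, under the assumption that $K = rN = 83$ and that the packet-length parameter is $1/195$ (both taken from Section V), compares the closed form for the segment-level parameter with a direct summation over packet lengths. The helper `pr_segments` is an illustrative name.

```python
rho = 1.0 / 195     # geometric parameter of the packet length (Section V value)
K = 83              # information bits per codeword, K = rN

# Closed form: the head segment is the last one when at most K bits remain.
rho_r = 1.0 - (1.0 - rho) ** K

# Same quantity by direct summation of Pr(L = l) for l = 1, ..., K.
rho_r_sum = sum((1.0 - rho) ** (l - 1) * rho for l in range(1, K + 1))

def pr_segments(m):
    """Pr(M = m), obtained by summing Pr(L = l) over all l with ceil(l / K) = m."""
    lo, hi = (m - 1) * K + 1, m * K
    return sum((1.0 - rho) ** (l - 1) * rho for l in range(lo, hi + 1))

# Geometric form predicted by the text, for the first few values of m.
geometric = [(1.0 - rho_r) ** (m - 1) * rho_r for m in range(1, 6)]
direct = [pr_segments(m) for m in range(1, 6)]
```

The two lists agree up to floating-point error, reflecting the memoryless property of the geometric packet length.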

For a head packet to depart from the queue, the destination 
must successfully decode the most recent codeword it received, 
and this codeword must carry the final segment of information 
corresponding to this packet. Implicit to our system model is 
the ability of the destination to acknowledge the reception of a 
codeword through instantaneous feedback. Based on this side 
information, the transmitter is able to remove data segments 
and packets from the queue after successful transmission. 



IV. Queueing Behavior 

The number of data packets in the queue at the onset of 
block $s$ is denoted by $Q_s$. The state of the Gilbert-Elliot 
channel at this same instant is represented by $C_{sN+1}$. Together, 
these two quantities form the state of our Markov process, 
$U_s = (C_{sN+1}, Q_s)$. We emphasize that the cardinality of this 
state space is countable, with $U_s$ belonging to $\{b, g\} \times \mathbb{N}_0$. 
Furthermore, the Markov chain underlying the evolution of 
our system possesses a special structure; it forms an instance 
of a discrete-time quasi-birth-death process. Fortunately, there 
are many established techniques to study such mathematical 
objects. We present one possible approach in Section IV-A. 

The transition probability from $U_s$ to $U_{s+1}$ is given by 

$$\Pr(U_{s+1} = (d, q_{s+1}) \mid U_s = (c, q_s)) = \sum_{e \in \mathbb{N}_0} \Pr(Q_{s+1} = q_{s+1} \mid E = e, Q_s = q_s) \Pr(E = e, C_{(s+1)N+1} = d \mid C_{sN+1} = c). \qquad (3)$$

Recall that a methodology was introduced in Section II-A to 
derive the distribution of $(E, C_{(s+1)N+1})$ conditioned on the 
value of $C_{sN+1}$. It remains to obtain expressions for probabilities of the 
type $\Pr(Q_{s+1} = q_{s+1} \mid E = e, Q_s = q_s)$. 

We first consider conditional events $\{Q_s = q_s\}$ for which 
$q_s > 0$; admissible values for $Q_{s+1}$ are then limited to values 
in $\{q_s - 1, q_s, q_s + 1\}$. Two factors can affect the length of 
the queue: the arrival of a new data packet and the completion 
of a packet transmission. The latter occurrence will only take 
place if a codeword is successfully decoded at the destination 
and the head packet has no additional data segment left at the 
source. Keeping these facts in mind, we get 

$$\begin{aligned}
\Pr(Q_{s+1} = q_s + 1 \mid E = e, Q_s = q_s) &= \gamma \left( P_f(N-K, e) + (1 - P_f(N-K, e))(1 - \rho_r) \right) \\
\Pr(Q_{s+1} = q_s \mid E = e, Q_s = q_s) &= \gamma (1 - P_f(N-K, e)) \rho_r + (1 - \gamma) \left( P_f(N-K, e) + (1 - P_f(N-K, e))(1 - \rho_r) \right) \\
\Pr(Q_{s+1} = q_s - 1 \mid E = e, Q_s = q_s) &= (1 - \gamma)(1 - P_f(N-K, e)) \rho_r.
\end{aligned}$$

When the queue is empty, $\{Q_s = 0\}$, only two possibilities 
can occur, 

$$\begin{aligned}
\Pr(Q_{s+1} = 1 \mid E = e, Q_s = 0) &= \gamma \\
\Pr(Q_{s+1} = 0 \mid E = e, Q_s = 0) &= 1 - \gamma.
\end{aligned}$$
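
These conditional probabilities translate directly into code. The sketch below is a minimal illustration; the function name `queue_step` and its argument names are ours. It takes the failure probability for the realized number of erasures as an input rather than computing it.

```python
def queue_step(p_f, gamma, rho_r, queue_empty=False):
    """Return (Pr(up), Pr(stay), Pr(down)) for the queue length over one block.

    p_f    : decoding-failure probability P_f(N - K, e) for the realized e
    gamma  : packet arrival probability per block
    rho_r  : probability that a decoded segment completes the head packet
    """
    if queue_empty:
        # Nothing to transmit: the queue can only stay empty or grow.
        return gamma, 1.0 - gamma, 0.0
    # A departure requires a successful decoding AND a final segment.
    depart = (1.0 - p_f) * rho_r
    up = gamma * (1.0 - depart)
    stay = gamma * depart + (1.0 - gamma) * (1.0 - depart)
    down = (1.0 - gamma) * depart
    return up, stay, down
```

The quantity `1.0 - depart` equals $P_f + (1 - P_f)(1 - \rho_r)$, so the three returned values match the expressions above and sum to one.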

Collecting these findings and using (3), we get the probability 
transition matrix of the Markov process $\{U_s\}$. A graphical 
rendition of the state transitions appears in Fig. 2. 

To proceed with the analysis of our queued system, a 
compact representation of the conditional probabilities defined 
in (3) is apropos. For $q \in \mathbb{N}$ and $c, d \in \{b, g\}$, we introduce 
the following mathematical notation, 

$$\begin{aligned}
\mu_{cd} &= \Pr(U_{s+1} = (d, q-1) \mid U_s = (c, q)) \\
\kappa_{cd} &= \Pr(U_{s+1} = (d, q) \mid U_s = (c, q)) \\
\lambda_{cd} &= \Pr(U_{s+1} = (d, q+1) \mid U_s = (c, q)).
\end{aligned}$$



Fig. 2. State space and transition diagram for the aggregate queued process 
$\{U_s\}$; self-transitions are intentionally omitted. 



Similarly, when the queue is empty, we use $\kappa^0_{cd} = \Pr(U_{s+1} = 
(d, 0) \mid U_s = (c, 0))$ and $\lambda^0_{cd} = \Pr(U_{s+1} = (d, 1) \mid U_s = (c, 0))$. 
Collectively, these labels define the 12 transition probabilities 
associated with a non-empty queue, and the 8 transition 
probabilities subject to the non-negativity constraint at zero. 

We are ready to derive the equilibrium distribution of our 
system. We note that, if the channel state is ergodic and 
the queue is stable, then the Markov chain $\{U_s\}$ is positive 
recurrent and possesses a unique stationary distribution [13]. 
Let $U = (C, Q)$ be a random vector with the following 
probability distribution, 

$$\Pr(U = (c, q)) = \lim_{s \to \infty} \Pr(U_s = (c, q)).$$

We employ the semi-infinite vector $\pi$ as a convenient notation 
for the equilibrium distribution of our system, with 

$$\pi(2q + i) = \begin{cases} \Pr(C = b, Q = q) & \text{if } i = 1 \\ \Pr(C = g, Q = q) & \text{if } i = 2, \end{cases}$$

for $i \in \{1, 2\}$ and $q \in \mathbb{N}_0$. The states $\{(b, q), (g, q)\}$ are 
known as the $q$th level of the Markov chain and $\pi_q = 
[\pi(2q + 1) \;\; \pi(2q + 2)]$ is the stationary distribution associated 
with the $q$th level. 

Using this compact notation, we can write the Chapman-Kolmogorov 
equations as $\pi T = \pi$, where $T$ is the probability 
transition matrix associated with $\{U_s\}$. One possible approach 
to solve for the stationary distribution of our Markov model 
is to employ spectral representation and ordinary generating 
functions [10]. In this article, we adopt an alternate means and 
apply the matrix geometric method [14], [15]. 

A. Matrix Geometric Method 

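We can represent the probability transition matrix $T$ as a 
semi-infinite matrix of the form 

$$T = \begin{bmatrix} C_1 & C_0 & 0 & 0 & \cdots \\ A_2 & A_1 & A_0 & 0 & \cdots \\ 0 & A_2 & A_1 & A_0 & \cdots \\ 0 & 0 & A_2 & A_1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}, \qquad (4)$$

where the submatrices $C_1$, $C_0$, $A_2$, $A_1$, and $A_0$ are $2 \times 2$ 
real matrices. More specifically, with rows indexed by $c$ and 
columns by $d$ in the order $(b, g)$, we have, for $q \geq 1$, 

$$\begin{aligned}
[A_0]_{c,d} &= \Pr(U_{s+1} = (d, q+1) \mid U_s = (c, q)) \\
[A_1]_{c,d} &= \Pr(U_{s+1} = (d, q) \mid U_s = (c, q)) \\
[A_2]_{c,d} &= \Pr(U_{s+1} = (d, q-1) \mid U_s = (c, q)).
\end{aligned}$$

When the queue is empty, the relevant submatrices become 

$$\begin{aligned}
[C_0]_{c,d} &= \Pr(U_{s+1} = (d, 1) \mid U_s = (c, 0)) \\
[C_1]_{c,d} &= \Pr(U_{s+1} = (d, 0) \mid U_s = (c, 0)).
\end{aligned}$$

Note that the Markov chain associated with (4) belongs to 
the class of processes with repetitive structure. The following 
theorem characterizes its stationary distribution. 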

Theorem 4.1: Consider a positive recurrent Markov chain 
on a countable state space with transition matrix $T$ given by 
(4). Let the positive matrix $R$ be defined as the limit, starting 
from $R_0 = 0$, of the matrix recursion 

$$R_{i+1} = (A_0 + R_i^2 A_2)(I - A_1)^{-1}.$$

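Then, the $q$th-level stationary distribution $\pi_q$ satisfies $\pi_{q+1} = 
\pi_q R$ for $q \geq 1$ with $\pi_1 = \pi_0 Z$ and $Z = C_0 (I - A_1 - R A_2)^{-1}$, 
where $\pi_0$ solves $\pi_0 (C_1 + Z A_2) = \pi_0$ subject to the 
normalization $\pi_0 \mathbf{1} + \pi_1 (I - R)^{-1} \mathbf{1} = 1$. 
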
Corollary 4.2: The decay rate of the complementary cumulative 
distribution function of the queue satisfies 

$$\lim_{\tau \to \infty} \tau^{-1} \log \Pr(Q > \tau) = \log \varrho(R),$$

where $\varrho(R)$ is the spectral radius of $R$. 
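
The recursion in Theorem 4.1 is easy to prototype for $2 \times 2$ blocks. The sketch below is an illustration under simplifying assumptions rather than a reproduction of the paper's computation: it iterates the recursion from zero and extracts the spectral radius, using a hypothetical memoryless example in which every block is a scalar multiple of the same rank-one stochastic matrix. In that special case the recursion collapses to a scalar one whose limit is the ratio of the upward to the downward scalar. All names and numbers are placeholders.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def solve_R(A0, A1, A2, tol=1e-13, max_iter=100000):
    """Iterate R <- (A0 + R^2 A2)(I - A1)^{-1} from R = 0 until it settles."""
    inv = mat_inv([[(1.0 if i == j else 0.0) - A1[i][j] for j in range(2)]
                   for i in range(2)])
    R = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(max_iter):
        R_next = mat_mul(mat_add(A0, mat_mul(mat_mul(R, R), A2)), inv)
        gap = max(abs(R_next[i][j] - R[i][j])
                  for i in range(2) for j in range(2))
        R = R_next
        if gap < tol:
            break
    return R

def spectral_radius(R):
    """Largest eigenvalue modulus of a nonnegative 2x2 matrix."""
    tr = R[0][0] + R[1][1]
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    disc = max(tr * tr - 4.0 * det, 0.0)
    return (tr + disc ** 0.5) / 2.0

def scaled(a, pi):
    """Block a * (1 pi): every row equals a times the row vector pi."""
    return [[a * pi[j] for j in range(2)] for _ in range(2)]

# Hypothetical memoryless example: up, stay, down weights 0.2, 0.3, 0.5.
pi = [0.2, 0.8]
a0, a1, a2 = 0.2, 0.3, 0.5
A0, A1, A2 = scaled(a0, pi), scaled(a1, pi), scaled(a2, pi)

R = solve_R(A0, A1, A2)
decay = spectral_radius(R)      # approaches a0 / a2 in this special case
```

For the actual system, the blocks would instead be assembled from the erasure distribution, the failure probabilities, and the queue transition probabilities derived earlier. When the load is close to the stability boundary, this simple fixed-point iteration can converge slowly.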

V. Performance Evaluation 

This mathematical characterization makes it possible to 
compute a wide range of advanced performance criteria for 
the system under consideration, including average packet error 
rate and outage capacity. Herein, we focus on two measures 
that are most relevant to delay-sensitive communications. First, 
we look at the probability that the queue exceeds a threshold, 
$\Pr(Q > \tau)$, where $\tau$ is relatively small. Second, we examine 
the decay rate of the complementary cumulative distribution 
function, as discussed in Corollary 4.2. Again, we emphasize 
that the tail decay in buffer occupancy is given by the dominant 
eigenvalue of R. 

For illustrative purposes, we select the following parameters. 
The Gilbert-Elliot erasure channel is defined by $\alpha = 0.02$, $\beta = 
0.005$, $\varepsilon_b = 0.49$, and $\varepsilon_g = 0.0025$. This generates an average 
erasure probability of 0.1. The channel memory decays at an 
exponential rate of $(1 - \alpha - \beta) = 0.975$. The block length 
is fixed at $N = 114$ and the arrival process is defined by 
the arrival probability $\gamma = 0.25$ and average packet length 
$\rho^{-1} = 195$. If codewords are transmitted every 4.615 ms, then 
this corresponds to an arrival rate of roughly 10.6 Kbits/sec 
and an ergodic channel capacity of roughly 22.2 Kbits/sec. 
These parameters are selected to loosely match the operation 
of a wireless GSM relay link. 
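
The figures quoted in this paragraph follow from short arithmetic, reproduced in the sketch below. The variable names are ours. The capacity line assumes that the ergodic capacity of the erasure channel equals one minus the average erasure probability per channel use, which is consistent with the value reported above.

```python
alpha, beta = 0.02, 0.005
eps_b, eps_g = 0.49, 0.0025
N, gamma, mean_len = 114, 0.25, 195
block_time = 4.615e-3                   # seconds per codeword

# Long-run fractions of time spent in the bad and good states.
pi_b = beta / (alpha + beta)
pi_g = alpha / (alpha + beta)

# Average erasure probability across both states.
avg_erasure = pi_b * eps_b + pi_g * eps_g

# Average arrival rate, first in bits per block, then in Kbits/sec.
arrival_bits_per_block = gamma * mean_len
arrival_kbps = arrival_bits_per_block / block_time / 1000

# Ergodic capacity in Kbits/sec.
capacity_kbps = (1 - avg_erasure) * N / block_time / 1000
```

The arrival rate of 48.75 bits per block sits at roughly half of the 102.6 bits per block that the channel supports on average.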

System performance as a function of the number of information 
bits per codeword, $K$, is shown in Fig. 3. Each curve 
represents the complementary cumulative distribution function 
evaluated at a different threshold value, $\Pr(Q > \tau)$. 

As expected, the probability of the queue exceeding a prescribed 
threshold decreases as $\tau$ increases. More interestingly, 
it is instructive to notice that $K = 83$ appears uniformly 
optimal for all values of $\tau$. Further supporting evidence for 
this observation is offered by looking at the asymptotic decay 
rate in tail occupancy, displayed in Fig. 4. When the arrival 
rate is between 47.5 and 60 bits per cycle, one finds that $K = 83$ is 
also optimal in terms of tail decay. This robustness property 
is very encouraging, as it simplifies system design. 

Fig. 3. This figure shows tail probabilities in the equilibrium packet distribution 
of the queue, $\Pr(Q > \tau)$, for threshold values $\tau \in \{5, 10, 15, 20, 25\}$, 
as a function of the number of information bits per block, $K$. 
The minimums occur uniformly at $rN = 83$ for all threshold values. 

An important observation that does not appear on these two 
figures is the fact that, for short block lengths, the optimal 
value of $K$ depends heavily on the channel parameters $\alpha$, $\beta$, $\varepsilon_g$, 
and $\varepsilon_b$. A naive conjecture would place $K = rN$ close to the 
Shannon limit $0.9 \times 114 = 102.6$, but this is much larger than 
the optimal value of $K = 83$. A more sophisticated approach 
is to maximize the throughput of a system with an infinite 
backlog. After some calculation, one finds that this leads to 
$K = 87$, which is much closer to the true optimum. But, as 
the channel memory parameter $(1 - \alpha - \beta)$ varies, the optimal 
value of $K$ changes substantially. In fact, as $(1 - \alpha - \beta) \to 1$, 
$K$ approaches $N$. 
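
The infinite-backlog calculation is not spelled out in the text. The sketch below shows one plausible way to set it up, assuming that throughput is measured as $K$ times the average probability of successful decoding when the channel starts each block in its stationary distribution. This definition, and the names `erasure_pmf`, `success_prob`, and `backlog_throughput`, are our assumptions, so the maximizer it returns need not coincide exactly with the value reported above.

```python
def erasure_pmf(N, alpha, beta, eps_b, eps_g):
    """Pr(E = e) when the channel starts in its stationary distribution."""
    P = [[1 - alpha, alpha], [beta, 1 - beta]]
    eps = [eps_b, eps_g]
    # f[k][e]: probability of being in state k with e erasures so far.
    f = [[beta / (alpha + beta)] + [0.0] * N,
         [alpha / (alpha + beta)] + [0.0] * N]
    for _ in range(N):
        g = [[0.0] * (N + 1) for _ in range(2)]
        for k in range(2):
            for e in range(N):
                mass = f[k][e]
                for d in range(2):
                    g[d][e] += mass * (1 - eps[k]) * P[k][d]
                    g[d][e + 1] += mass * eps[k] * P[k][d]
        f = g
    return [f[0][e] + f[1][e] for e in range(N + 1)]

def success_prob(parity, e):
    """1 - P_f(N - K, e) for a random parity-check code."""
    if e > parity:
        return 0.0
    s = 1.0
    for j in range(e):
        s *= 1.0 - 2.0 ** (j - parity)
    return s

def backlog_throughput(K, pmf):
    """Expected information bits delivered per block with K bits per codeword."""
    N = len(pmf) - 1
    return K * sum(p * success_prob(N - K, e) for e, p in enumerate(pmf))

# Section V parameters; search over all admissible values of K.
pmf = erasure_pmf(114, 0.02, 0.005, 0.49, 0.0025)
best_K = max(range(1, 115), key=lambda K: backlog_throughput(K, pmf))
```

Sweeping the channel memory parameter in this sketch, while holding the stationary distribution fixed, is one way to explore how the maximizer moves as the channel becomes more persistent.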



Fig. 4. This figure shows tail decay rate, $-\lim_{\tau \to \infty} \tau^{-1} \log \Pr(Q > \tau)$, 
as a function of the number of information bits $rN$ and the average arrival 
rate $\gamma / \rho$ in bits per cycle. 



VI. Conclusions 

This work provides a unified approach that links queueing 
performance with the operation of a communication system at 
the physical layer. The methodology and results are developed 
for the Gilbert-Elliot erasure channel, but can be generalized to 
more intricate finite-state channels with memory. For example, 
the simple performance characterization of random codes over 
erasure channels extends naturally to hard-decision decoding 
of BCH codes over Gilbert-Elliot error channels. For fixed 
parameters, the optimal code rate appears relatively insensitive 
to target threshold $\tau$ in the queue. Still, channel memory and 
cross-over probabilities can affect this optimal operating point. 
More generally, the optimal code rate seems to be linked to 
the ratio between the codeword time and the coherence time of 
the channel. 